conditional probability density
Estimating Conditional Probability Densities for Periodic Variables
Most of the common techniques for estimating conditional prob(cid:173) ability densities are inappropriate for applications involving peri(cid:173) odic variables. In this paper we introduce three novel techniques for tackling such problems, and investigate their performance us(cid:173) ing synthetic data. We then apply these techniques to the problem of extracting the distribution of wind vector directions from radar scatterometer data gathered by a remote-sensing satellite.
Learning Human-Inspired Force Strategies for Robotic Assembly
Scherzinger, Stefan, Roennau, Arne, Dillmann, Rüdiger
The programming of robotic assembly tasks is a key component in manufacturing and automation. Force-sensitive assembly, however, often requires reactive strategies to handle slight changes in positioning and unforeseen part jamming. Learning such strategies from human performance is a promising approach, but faces two common challenges: the handling of low part clearances which is difficult to capture from demonstrations and learning intuitive strategies offline without access to the real hardware. We address these two challenges by learning probabilistic force strategies from data that are easily acquired offline in a robot-less simulation from human demonstrations with a joystick. We combine a Long Short Term Memory (LSTM) and a Mixture Density Network (MDN) to model human-inspired behavior in such a way that the learned strategies transfer easily onto real hardware. The experiments show a UR10e robot that completes a plastic assembly with clearances of less than 100 micrometers whose strategies were solely demonstrated in simulation.
Diffusion Models Made Easy
In the recent past, I have talked about GANs and VAEs as two important Generative Models that have found a lot of success and recognition. GANs work great for multiple applications however, they are difficult to train, and their output lack diversity due to several challenges such as mode collapse and vanishing gradients to name a few. Although VAEs have the most solid theoretical foundation however, the modelling of a good loss function is a challenge in VAEs which makes their output to be suboptimal. There is another set of techniques which originate from probabilistic likelihood estimation methods and take inspiration from physical phenomenon; it is called, Diffusion Models. The central idea behind Diffusion Models comes from the thermodynamics of gas molecules whereby the molecules diffuse from high density to low density areas.
Semi-supervised Conditional Density Estimation for Imputation and Classification of Incomplete Instances
Incomplete instances with various missing attributes in many real-world scenes have brought challenges to the classification task. There are some missing values imputation methods to fill the missing values with substitute values before classification. However, the separation between imputation and classification may lead to inferior performance since label information are ignored during imputation. Moreover, these imputation methods tend to initialize these missing values with strong prior assumptions, while the unreliability of such initialization is rarely considered. To tackle these problems, a novel semi-supervised conditional normalizing flow (SSCFlow) is proposed in this paper. SSCFlow explicitly utilizes the observed labels to facilitate the imputation and classification simultaneously by employing a semi-supervised algorithm to estimate the conditional probability density of missing values. Moreover, SSCFlow takes the initialized missing values as corrupted initial imputation and iteratively reconstructs their latent representations with an overcomplete denoising autoencoder to approximate the true conditional probability density of missing values. Experiments have been conducted with real-world datasets to demonstrate the robustness and efficiency of the proposed algorithm.
Conditional Density Estimation with Neural Networks: Best Practices and Benchmarks
Rothfuss, Jonas, Ferreira, Fabio, Walther, Simon, Ulrich, Maxim
Given a set of empirical observations, conditional density estimation aims to capture the statistical relationship between a conditional variable $\mathbf{x}$ and a dependent variable $\mathbf{y}$ by modeling their conditional probability $p(\mathbf{y}|\mathbf{x})$. The paper develops best practices for conditional density estimation for finance applications with neural networks, grounded on mathematical insights and empirical evaluations. In particular, we introduce a noise regularization and data normalization scheme, alleviating problems with over-fitting, initialization and hyper-parameter sensitivity of such estimators. We compare our proposed methodology with popular semi- and non-parametric density estimators, underpin its effectiveness in various benchmarks on simulated and Euro Stoxx 50 data and show its superior performance. Our methodology allows to obtain high-quality estimators for statistical expectations of higher moments, quantiles and non-linear return transformations, with very little assumptions about the return dynamic.
The Computational Power of Dynamic Bayesian Networks
Bayesian networks are probabilistic graphical models that represent a set of random variables and their conditional dependencies via a directed acyclic graph. Explicitly modeling the conditional dependencies between random variables permit efficient algorithms to perform inference and learning in the network. Causal Bayesian networks have the additional requirement that all edges in the network model a causal relationship. Dynamic Bayesian networks are the time-generalization of Bayesian networks and relate variables to each other over adjacent time steps. Dynamic Bayesian networks unify and extend a number of state-space models including hidden Markov models, hierarchical hidden Markov models and Kalman filters. Dynamic Bayesian networks can also be seen as the natural extension of acyclic causal models to models that permit cyclic causal relationships, while avoiding problems with causal models that try to model temporal relationships with an atemporal description [1]. A natural question is what is the expressive power of such networks. The result in this paper shows that although discrete dynamic Bayesian networks are sub-Turing in computational power, introducing continuous random variables with discrete children is sufficient to model Turing-complete computation.
Ensembles of Protein Molecules as Statistical Analog Computers
A class of analog computers built from large numbers of microscopic probabilistic machines is discussed. It is postulated that such computers are implemented in biological systems as ensembles of protein molecules. The formalism is based on an abstract computational model referred to as Protein Molecule Machine (PMM). A PMM is a continuous-time first-order Markov system with real input and output vectors, a finite set of discrete states, and the input-dependent conditional probability densities of state transitions. The output of a PMM is a function of its input and state. The components of input vector, called generalized potentials, can be interpreted as membrane potential, and concentrations of neurotransmitters. The components of output vector, called generalized currents, can represent ion currents, and the flows of second messengers. An Ensemble of PMMs (EPMM) is a set of independent identical PMMs with the same input vector, and the output vector equal to the sum of output vectors of individual PMMs. The paper suggests that biological neurons have much more sophisticated computational resources than the presently popular models of artificial neurons.
Mixtures of Gaussian Processes
We introduce the mixture of Gaussian processes (MGP) model which is useful for applications in which the optimal bandwidth of a map is input dependent. The MGP is derived from the mixture of experts model and can also be used for modeling general conditional probability densities. We discuss how Gaussian processes -in particular in form of Gaussian process classification, the support vector machine and the MGP modelcan be used for quantifying the dependencies in graphical models. 1 Introduction Gaussian processes are typically used for regression where it is assumed that the underlying function is generated by one infinite-dimensional Gaussian distribution (i.e.
Mixtures of Gaussian Processes
We introduce the mixture of Gaussian processes (MGP) model which is useful for applications in which the optimal bandwidth of a map is input dependent. The MGP is derived from the mixture of experts model and can also be used for modeling general conditional probability densities. We discuss how Gaussian processes -in particular in form of Gaussian process classification, the support vector machine and the MGP modelcan be used for quantifying the dependencies in graphical models. 1 Introduction Gaussian processes are typically used for regression where it is assumed that the underlying function is generated by one infinite-dimensional Gaussian distribution (i.e.
Mixtures of Gaussian Processes
We introduce the mixture of Gaussian processes (MGP) model which is useful for applications in which the optimal bandwidth of a map is input dependent. The MGP is derived from the mixture of experts model and can also be used for modeling general conditional probability densities. We discuss how Gaussian processes -in particular in form of Gaussian process classification, the support vector machine and the MGP modelcan beused for quantifying the dependencies in graphical models. 1 Introduction Gaussian processes are typically used for regression where it is assumed that the underlying functionis generated by one infinite-dimensional Gaussian distribution (i.e.